Optimal Taxation and Public Production
II: Tax Rules 


By PETER A. DIAMOND AND JAMES A. MIRRLEES* 


In Part I of this paper, which appeared
in the March 1971 issue of this Review, we
set out the problem of using taxation and 
government production to maximize a 
social welfare function. We derived the 
first-order conditions, and considered the 
argument for efficiency in aggregate pro- 
duction. Here in Part II we consider the 
structure of optimal taxes in more detail. 
Part I contained five sections, and Part II 
begins at Section VI. In the sixth and 
seventh sections we consider commodity 
taxation in one- and many-consumer econ- 
omies. In the eighth section we consider 
other kinds of taxes; and in the ninth, pub- 
lic consumption. In the tenth section we 
consider a rigorous treatment of the prob- 
lem, giving a sufficient condition for the 
validity of the first-order conditions. To 
begin, we shall restate the notation and 
basic problem. 


Notation 


Aggregate net demand is $X(q) = \sum_h x^h(q)$.


* Massachusetts Institute of Technology and Nuffield
College, respectively. The remainder of the matching
footnote in Part I is appropriate here too.


With this notation before us again, we can restate the
maximization problem as that of selecting $q$ to


(33)   Maximize $V(q)$ subject to $G(X(q)) \le 0$.


This problem gave rise to
the first-order conditions ((19) and (22)) 
which were equivalently stated as 


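(34)   $\dfrac{\partial V}{\partial q_k} = \lambda \sum_i p_i \dfrac{\partial X_i}{\partial q_k} = -\lambda \dfrac{\partial}{\partial t_k}\Bigl(\sum_i t_i X_i\Bigr) \qquad (k = 1, 2, \ldots, n)$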


Equations (34) were derived only for $k = 2, \ldots, n$. But we can see that they hold also
for $k = 1$; for, on multiplying by $q_k$ and
adding, we have

$\sum_k q_k \Bigl[\dfrac{\partial V}{\partial q_k} - \lambda \sum_i p_i \dfrac{\partial X_i}{\partial q_k}\Bigr] = 0$

by the homogeneity of degree 0 of $V$ and
the $X_i$. Equation (34) states that the impact
of a price rise on social welfare is proportional
to the cost, at producer prices, of meeting the change
in demand induced by the price rise. Alternatively,
the impact of a tax increase on social welfare is
proportional to the induced change in tax revenue (all calculated
at fixed producer prices).


VI. Optimal Tax Structure— 
One-Consumer Economy 


For one consumer and an individualistic 
welfare function (so that V coincides with 
v, the indirect utility function of the only 
consumer in the economy), we can express 
directly the derivative of social welfare 
with respect to $q_k$ ($v_k = -\alpha x_k$, where $\alpha$ is
the marginal utility of income; see equation
(5) of Part I). For this case we can
then explore the structure of taxation in 
more detail. The formulation of the first- 
order conditions using compensated de- 
mand derivatives is due to Paul Samuelson 
(1951). We begin by stating the familiar 
Slutsky equation: 

(35)   $\dfrac{\partial x_i}{\partial q_k} = s_{ik} - x_k \dfrac{\partial x_i}{\partial I}$


where $s_{ik}$ is the derivative of the compensated
demand curve for good $i$ with respect to
$q_k$, and $\partial x_i/\partial I$ is the derivative of the
uncompensated demand with respect to income
(evaluated at $I = 0$ in our case). We
shall make use of the well-known result
that $s_{ik} = s_{ki}$.
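
The symmetry $s_{ik} = s_{ki}$ can be illustrated numerically. The sketch below builds the matrix of terms defined by (35) using finite differences. It assumes, purely for illustration, the preferences $u = \sum_i b_i \log(x_i + \omega_i)$ that appear later in this section; the function names, step size, and parameter values are our own choices and are not part of the original analysis.

```python
def demand(q, b, w, income=0.0):
    """Net demands maximizing sum_i b_i*log(x_i + w_i), with sum(b) == 1,
    subject to the budget constraint sum_i q_i*x_i = income."""
    full_income = income + sum(qj * wj for qj, wj in zip(q, w))
    return [bi * full_income / qi - wi for bi, qi, wi in zip(b, q, w)]


def slutsky_matrix(q, b, w, h=1e-6):
    """Finite-difference version of (35): s_ik = dx_i/dq_k + x_k * dx_i/dI at I = 0."""
    n = len(q)
    x = demand(q, b, w)
    up, down = demand(q, b, w, h), demand(q, b, w, -h)
    dx_dI = [(u - d) / (2 * h) for u, d in zip(up, down)]
    s = [[0.0] * n for _ in range(n)]
    for k in range(n):
        q_up = list(q)
        q_up[k] += h
        q_dn = list(q)
        q_dn[k] -= h
        x_up, x_dn = demand(q_up, b, w), demand(q_dn, b, w)
        for i in range(n):
            dxi_dqk = (x_up[i] - x_dn[i]) / (2 * h)
            s[i][k] = dxi_dqk + x[k] * dx_dI[i]
    return s


# Illustrative (assumed) parameter values.
q = [1.0, 0.9, 0.8]
b = [1 / 3, 1 / 3, 1 / 3]
w = [0.0, 1.0, 2.0]
S = slutsky_matrix(q, b, w)
```

For these assumed preferences the computed matrix should be symmetric, have negative diagonal entries, and satisfy $\sum_k s_{ik} q_k = 0$, the properties used in the argument that follows.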

Substituting into the first-order condi- 
tions (34) we have: 


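(36)   $-\alpha x_k = -\lambda \dfrac{\partial}{\partial t_k}\Bigl(\sum_i t_i x_i\Bigr) = -\lambda \Bigl(x_k + \sum_i t_i \dfrac{\partial x_i}{\partial q_k}\Bigr) = -\lambda x_k - \lambda \sum_i t_i s_{ik} + \lambda x_k \sum_i t_i \dfrac{\partial x_i}{\partial I}$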


Rearranging terms, we can write this in 
the form: 


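(37)   $\dfrac{\sum_i t_i s_{ik}}{x_k} = \dfrac{\alpha - \lambda}{\lambda} + \sum_i t_i \dfrac{\partial x_i}{\partial I}$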


The point to be noticed is that the right- 
hand side of this equation is independent 
of $k$. Call it $-\theta$. Finally, using the symmetry
of the Slutsky matrix, we write the
first-order conditions as:


(38)   $\dfrac{\sum_i s_{ki} t_i}{x_k} = -\theta$


Multiplying by $t_k x_k$, and summing, we obtain

(39)   $\theta \sum_k t_k x_k = -\sum_{k,i} t_k s_{ki} t_i \ge 0,$

by the negative semi-definiteness of the
Slutsky matrix. Thus $\theta$ has the same sign
as net government revenue.

The left-hand side of (38) is the percentage
change in the demand for good $k$
that would result from the tax change if
producer prices were constant, the consumer
were compensated so as to stay on
the same indifference curve, and the derivatives
of the compensated demand curves
were constant at the same level as at the
optimum point:


(40)   $\Delta x_k = \sum_i \int_0^{t_i} \dfrac{\partial x_k}{\partial t_i}\, dt_i = \sum_i \int_0^{t_i} s_{ki}\, dt_i = \sum_i s_{ki} \int_0^{t_i} dt_i = \sum_i s_{ki} t_i$


In fact, it is not possible for all these deriv- 
atives to be constant. But 


We can also calculate the actual changes 
in demand arising from the tax structure 
(assuming price derivatives of demand and 
production prices are constant) by resub- 
stituting from the Slutsky equation (35). 
Then, upon substitution, we have:


(41)   $\sum_i t_i \dfrac{\partial x_k}{\partial q_i} = -\theta x_k - \dfrac{\partial x_k}{\partial I} \sum_i t_i x_i$


Three-Good Economy 


In the case of a three-good economy, we 
can obtain an expression for the relative 
ad valorem tax rates of the two taxed 
goods. This argument is similar to that of 
W. J. Corlett and D. C. Hague, who dis- 
cussed the direction of movement away 
from proportional taxation that would in- 
crease utility. In the three-good case, with 
good one untaxed, the first-order condi- 
tions (38) become 

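(42)   $s_{22} t_2 + s_{23} t_3 = -\theta x_2, \qquad s_{32} t_2 + s_{33} t_3 = -\theta x_3$


Solving these equations we have


(43)   $t_2 = \theta\, \dfrac{s_{23} x_3 - s_{33} x_2}{s_{22} s_{33} - s_{23}^2}, \qquad t_3 = \theta\, \dfrac{s_{32} x_2 - s_{22} x_3}{s_{22} s_{33} - s_{23}^2}$
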
Notice that the denominator here is positive,
by the properties of the Slutsky matrix.
We convert these into elasticity expressions,
defining the elasticity of compensated
demand by


(44)   $\sigma_{ik} = \dfrac{q_k s_{ik}}{x_i}$

Equation (43) can then be written


(45)   $\dfrac{t_2}{q_2} = \theta'(\sigma_{23} - \sigma_{33}), \qquad \dfrac{t_3}{q_3} = \theta'(\sigma_{32} - \sigma_{22}),$

where

$\theta' = \dfrac{\theta x_2 x_3}{q_2 q_3 (s_{22} s_{33} - s_{23}^2)}$


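We now substitute for $\sigma_{23}$ and $\sigma_{32}$, using
the adding-up properties of compensated
elasticities,

(46)   $\sigma_{23} = -\sigma_{22} - \sigma_{21}, \qquad \sigma_{32} = -\sigma_{33} - \sigma_{31}$


This gives us


(47)   $\dfrac{t_2}{q_2} = -\theta'(\sigma_{21} + \sigma_{22} + \sigma_{33}), \qquad \dfrac{t_3}{q_3} = -\theta'(\sigma_{31} + \sigma_{22} + \sigma_{33})$


Suppose that good 1 is labor and that
goods 2 and 3 are consumer goods ($x_2 > 0$,
$x_3 > 0$). Then $\theta'$ has the same sign as $\theta$,
and therefore the same sign as
government revenue. For definiteness, suppose
that government revenue is positive
so that $\theta' > 0$. Equation (47) shows that


(48)   $\dfrac{t_2}{q_2} \gtrless \dfrac{t_3}{q_3}$ according as $\sigma_{21} \lessgtr \sigma_{31}$.


Thus the tax rate is proportionately greater for the good
with the smaller cross-elasticity of compensated demand
with the price of labor. (It is possible that one commodity is
subsidized, but it has to be the one with
the greater cross-elasticity.)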
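
A small numerical check of the three-good rule may help fix ideas. The sketch below evaluates (43), (44), and (47) at a given consumer price vector for the assumed preferences $u = \sum_i b_i \log(x_i + \omega_i)$, whose Slutsky terms have a simple closed form. The value of $\theta$ is simply assumed rather than derived from a revenue requirement, so this only illustrates how the formulae fit together and is not a full optimum calculation; all names and numbers are hypothetical.

```python
def stone_geary(q, b, w):
    """Net demands and analytic Slutsky matrix for u = sum_i b_i*log(x_i + w_i), sum(b) == 1."""
    m = sum(qi * wi for qi, wi in zip(q, w))  # value of the omega terms at prices q
    x = [bi * m / qi - wi for bi, qi, wi in zip(b, q, w)]
    n = len(q)
    s = [[-b[i] * (1 - b[i]) * m / q[i] ** 2 if i == k
          else b[i] * b[k] * m / (q[i] * q[k])
          for k in range(n)] for i in range(n)]
    return x, s


def taxes_from_43(x, s, theta):
    """Equation (43); Python indices 0, 1, 2 correspond to goods 1, 2, 3."""
    det = s[1][1] * s[2][2] - s[1][2] ** 2
    t2 = theta * (s[1][2] * x[2] - s[2][2] * x[1]) / det
    t3 = theta * (s[2][1] * x[1] - s[1][1] * x[2]) / det
    return t2, t3


def sigma(q, x, s, i, k):
    """Compensated elasticity (44), with zero-based indices."""
    return q[k] * s[i][k] / x[i]


# Assumed illustrative values: good 1 is supplied (labor), goods 2 and 3 are consumed.
q = [1.0, 1.2, 1.5]
b = [0.5, 0.3, 0.2]
w = [2.0, 0.1, -0.2]
theta = 0.1
x, s = stone_geary(q, b, w)
t2, t3 = taxes_from_43(x, s, theta)
rate2, rate3 = t2 / q[1], t3 / q[2]
sigma21, sigma31 = sigma(q, x, s, 1, 0), sigma(q, x, s, 2, 0)
```

With these assumed numbers good 2 has the larger cross-elasticity with the price of labor, and the computed rates agree with (48): good 2 carries the lower ad valorem rate.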


Examples 


The implications of the above model are 
very diverse, depending upon the nature 
of the demand functions. A simple example 
will show how the theory can be used. If 
we define ordinary demand elasticities by 
the usual formula 


(49)   $\epsilon_{ik} = q_k x_i^{-1} \dfrac{\partial x_i}{\partial q_k}$


we can rewrite the optimal taxation formula
in the form


(50)   $v_k = q_k^{-1} \lambda \sum_i p_i x_i \epsilon_{ik}$

When the welfare function is individualistic,
equation (5) applies, so that equation
(50) may be written as


(51)   $-\alpha q_k x_k = \lambda \sum_i p_i x_i \epsilon_{ik}$


or 


If the price of a good does not affect other demands (implying a unitary
own-price elasticity), equation (51) simplifies
to yield the optimal tax on that good:


(52)   If $\epsilon_{ik} = 0$ ($i \ne k$) and $\epsilon_{kk} = -1$, then $q_k p_k^{-1} = \lambda \alpha^{-1}$,


where $q_k p_k^{-1}$ equals one plus the percentage
tax rate. Recalling that $\alpha$ is the marginal
utility of income while $\lambda$ reflects the change
in welfare from allowing a government
deficit financed from some outside source,
their ratio gives a marginal cost (in terms
of the numeraire good) of raising revenue.


An example of a utility function
exhibiting such demand curves is the
Cobb-Douglas, where only labor is supplied.
As an example consider:


(53)   $u(x) = b_1 \log(x_1 + \omega_1) + \sum_{i=2}^{n} b_i \log x_i$

For these preferences the conditions of (52) hold for every
good $k \ge 2$, so
that the optimal tax structure is a proportional
tax structure.


It is easy to exhibit examples where the
optimal tax structure is not proportional.
Consider the example:


(54)   $u(x) = \sum_i b_i \log(x_i + \omega_i), \qquad \sum_i b_i = 1, \quad \omega_i \ne 0$


The demands arising from these prefer- 
ences are: 


THE AMERICAN ECONOMIC REVIEW 


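(55)   $x_i = q_i^{-1} b_i \sum_j q_j \omega_j - \omega_i$


Therefore the demand elasticities are:


(56)   $\epsilon_{ik} = \dfrac{b_i \omega_k q_k}{q_i x_i} \quad (k \ne i), \qquad \epsilon_{kk} = -b_k x_k^{-1} \sum_{j \ne k} \omega_j \dfrac{q_j}{q_k}$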
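
The closed forms in (56) can be checked against the definition (49). The following sketch, with assumed parameter values and hypothetical function names, compares the two using central finite differences.

```python
def demand(q, b, w):
    """Demand functions (55), assuming sum(b) == 1 and the budget sum_i q_i*x_i = 0."""
    m = sum(qj * wj for qj, wj in zip(q, w))
    return [bi * m / qi - wi for bi, qi, wi in zip(b, q, w)]


def elasticity_56(q, b, w, i, k):
    """Closed-form elasticities (56), with zero-based indices."""
    x = demand(q, b, w)
    if i != k:
        return b[i] * w[k] * q[k] / (q[i] * x[i])
    others = sum(w[j] * q[j] for j in range(len(q)) if j != k)
    return -b[k] * others / (x[k] * q[k])


def elasticity_numeric(q, b, w, i, k, h=1e-6):
    """Definition (49), evaluated with a central finite difference."""
    q_up = list(q)
    q_up[k] += h
    q_dn = list(q)
    q_dn[k] -= h
    dx = (demand(q_up, b, w)[i] - demand(q_dn, b, w)[i]) / (2 * h)
    return q[k] * dx / demand(q, b, w)[i]
```

The comparison is only meaningful at prices where every $x_i$ is nonzero, since the elasticities divide by $x_i$.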


Substituting in the formula for the optimal 
taxes, 


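(57)   $\alpha q_k x_k = \lambda \Bigl[\dfrac{p_k}{q_k} b_k \sum_{j \ne k} \omega_j q_j - \omega_k q_k \sum_{j \ne k} \dfrac{p_j}{q_j} b_j\Bigr] = \lambda \sum_j \Bigl[b_k \omega_j q_j \dfrac{p_k}{q_k} - b_j \omega_k q_k \dfrac{p_j}{q_j}\Bigr]$


Since the assumption $\sum_i b_i = 1$ allows us
to write the demand functions (55) in the
form:


(58)   $q_k x_k = \sum_j \bigl[b_k \omega_j q_j - b_j \omega_k q_k\bigr],$


we can deduce from (57) and (58) that


(59)   $b_k \Bigl(\dfrac{p_k}{q_k} - \dfrac{\alpha}{\lambda}\Bigr) \sum_j \omega_j q_j = \omega_k q_k \sum_j b_j \Bigl(\dfrac{p_j}{q_j} - \dfrac{\alpha}{\lambda}\Bigr)$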


These equations allow us to calculate p 
for any given q, and in that way give the 
optimal taxation rules. In general, taxes 
will not be proportional. As one example 
of this, consider the following three-good 
case. 


Sample Calculation 


Let us combine the above two examples
by considering a three-good economy
(one consumer good and two types of
labor) with preferences as in (54). This
example will be used to show that limited
tax possibilities (represented by the same
proportional tax on goods 2 and 3) introduce
the desirability of aggregate production
inefficiency.


Example e. Assume that preferences satisfy

(60a)   $u = \log x_1 + \log(x_2 + 1) + \log(x_3 + 2), \qquad x_1 > 0,\ x_2 > -1,\ x_3 > -2;$

while private production possibilities are

(60b)   $y_1 + y_2 + y_3 \le 0, \qquad y_1 \ge 0,\ y_2 \le 0,\ y_3 \le 0;$

and the government constraint is

(60c)   $1.02 z_1 + z_2 \le 0, \qquad z_1 \ge 0,\ z_2 \le 0,\ z_3 \le -0.1$


Thus the government needs good 3 for 
public use and can produce good 1 from 
good 2, but only less efficiently than the 
private sector can. 

Since we know that production efficiency
is desired, we have


$q_1 = p_1 = p_2 = p_3 = 1, \qquad z_1 = z_2 = 0$


From the first-order conditions (59) and
market clearance given the demands (58),
we obtain two equations to determine $q_2$
and $q_3$:


$q_2 (q_3^{-1} - 1) = 2 q_3 (q_2^{-1} - 1)$

$(q_2 + 2 q_3)(q_2^{-1} + q_3^{-1} + 1) = 8.7$


These have a unique positive solution 


$q_2 = 0.94494, \qquad q_3 = 0.90008$

which give

$x_1 = 0.9150, \quad x_2 = -0.0316, \quad x_3 = -0.9834, \quad u = -0.1045$


If we now require the same tax rate on
goods 2 and 3 and at the same time impose
production efficiency, then $q_2 = q_3 = q$,
and the tax rate is determined by the
market clearance equation. We obtain


$3q + 6 = 8.7$; i.e., $q = 0.9$


Then demands are


$x_1 = 0.9, \quad x_2 = 0, \quad x_3 = -1$


and

$u = -0.1054$


Notice that the economy is still on the
production frontier even though both
input prices are lower in this case. If we
introduce inefficiency with $p_2 > 1$, so that
$y_2 = 0$ and $x_2 = z_2$, we can increase utility.
Market clearance now requires


$(q_2 + 2 q_3)\bigl((1.02)^{-1} q_2^{-1} + q_3^{-1} + 1\bigr) = 8.7$


At prices $q_2 = 0.92$, $q_3 = 0.90008$, for example,
we have $x_1 = 0.9067$, $x_2 = -0.0144$, $x_3 = -0.9926$,
and $u = -0.1051$.
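
The figures in this sample calculation can be reproduced with a few lines of code. The sketch below assumes the demands (55) with equal weights and $\omega = (0, 1, 2)$, solves the two equations for the efficient optimum by Newton's method, and evaluates utility at the three price pairs reported above. The function names, starting point, and tolerance are our own choices.

```python
import math


def demands(q2, q3):
    """Demands (55) for example (60a): equal weights, omega = (0, 1, 2), q1 = 1."""
    m = q2 + 2.0 * q3
    return m / 3.0, m / (3.0 * q2) - 1.0, m / (3.0 * q3) - 2.0


def utility(q2, q3):
    x1, x2, x3 = demands(q2, q3)
    return math.log(x1) + math.log(x2 + 1.0) + math.log(x3 + 2.0)


def solve_efficient_optimum(q2=0.9, q3=0.9, tol=1e-12, max_iter=50):
    """Newton's method on the two equations determining q2 and q3."""
    for _ in range(max_iter):
        m = q2 + 2.0 * q3
        r = 1.0 / q2 + 1.0 / q3 + 1.0
        f1 = q2 * (1.0 / q3 - 1.0) - 2.0 * q3 * (1.0 / q2 - 1.0)
        f2 = m * r - 8.7
        if abs(f1) < tol and abs(f2) < tol:
            break
        # Analytic Jacobian of (f1, f2) with respect to (q2, q3).
        a11 = (1.0 / q3 - 1.0) + 2.0 * q3 / q2 ** 2
        a12 = -q2 / q3 ** 2 - 2.0 * (1.0 / q2 - 1.0)
        a21 = r - m / q2 ** 2
        a22 = 2.0 * r - m / q3 ** 2
        det = a11 * a22 - a12 * a21
        q2 -= (f1 * a22 - f2 * a12) / det
        q3 -= (a11 * f2 - a21 * f1) / det
    return q2, q3


q2_opt, q3_opt = solve_efficient_optimum()
u_optimal = utility(q2_opt, q3_opt)      # unrestricted taxes, efficient production
u_uniform = utility(0.9, 0.9)            # equal tax rates, efficient production
u_inefficient = utility(0.92, 0.90008)   # prices quoted in the text for the inefficient case
```

The three utility levels are ranked as the text describes: the restricted, efficient case gives the lowest utility, and the inefficient case with the quoted prices lies between it and the unrestricted optimum.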


VII. Optimal Tax Structure— 
Many-Consumer Economy 


As we noted in Section III of Part I, 
the equations for optimal taxation with a 
single consumer which do not reflect the 
particular form of V are also valid for 
many consumers. To pursue the analysis 
further, we must find an expression for
$V_k$, the derivative of social welfare with
respect to the $k$th consumer price.

With an individualistic welfare function, 
we have 


(61)   $V(q) = W\bigl(v^1(q), v^2(q), \ldots, v^H(q)\bigr)$


Differentiating with respect to $q_k$, we
obtain


(62)   $V_k = \sum_h \dfrac{\partial W}{\partial v^h} v_k^h = -\sum_h \dfrac{\partial W}{\partial v^h} \alpha^h x_k^h,$

where

(63)   $\beta^h = \dfrac{\partial W}{\partial v^h} \alpha^h$


is the increase in social welfare from a unit
increase in the income of consumer $h$. We
have


(64)   $-V_k = \sum_h \beta^h x_k^h,$

or the derivative of welfare with respect to
a price equals the "welfare-weighted" net
consumer demand for commodity $k$. The
necessary condition for optimal taxation
makes $V_k$ proportional to the marginal
contribution to tax revenue from raising
the tax on good $k$.


(65)   $\sum_h \beta^h x_k^h = \lambda \dfrac{\partial T}{\partial t_k},$


where $T = \sum_i t_i X_i$ is total tax revenue,
and the derivative is evaluated at constant
producer prices (i.e., on the basis of consumer
excess demand functions alone).
We also have the alternative formula


(66)   $\sum_h \beta^h x_k^h = -\lambda \sum_i p_i \dfrac{\partial X_i}{\partial q_k}$


Example f. Before turning to interpreta- 
tions of the optimal tax formulae like those 
above, let us consider an example. 

We will assume that each consumer has 
a Cobb-Douglas utility function, 


(67)   $u^h = b_1^h \log(x_1 + \omega^h) + \sum_{i=2}^{n} b_i^h \log x_i, \qquad \sum_{i=1}^{n} b_i^h = 1$

Choosing good 1 as numeraire, we saw in 
Section VI that with a one-consumer 
economy, taxation would be proportional. 
This will not, in general, be true in a 
many-consumer economy where each con- 
sumer has this utility function. The indi- 
vidual demand curves arising from this 
utility function are: 


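(68)   $x_i^h = q_i^{-1} b_i^h q_1 \omega^h \quad (i = 2, 3, \ldots, n), \qquad x_1^h = -(1 - b_1^h)\,\omega^h$


Notice that $\partial x_i^h/\partial q_k = 0$ ($k \ne i \ne 1$) and $\partial x_i^h/\partial q_i = -x_i^h/q_i$ ($i \ne 1$).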

Assuming an individualistic welfare 
function, the first-order conditions (66) 
are in this case 


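(69)   $\sum_h \beta^h x_k^h = \lambda p_k q_k^{-1} \sum_h x_k^h \qquad (k = 2, \ldots, n)$


This implies the following formula:


(70)   $\dfrac{q_k}{p_k} = \lambda\, \dfrac{\sum_h x_k^h}{\sum_h \beta^h x_k^h} = \lambda\, \dfrac{\sum_h b_k^h \omega^h}{\sum_h \beta^h b_k^h \omega^h} \qquad (k = 2, \ldots, n)$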
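
To see how the weights $\beta^h$ shape the tax structure, the following sketch evaluates the ratio $q_k/p_k = \lambda \sum_h b_k^h \omega^h / \sum_h \beta^h b_k^h \omega^h$ implied by the first-order conditions above for two hypothetical households. The multiplier $\lambda$ is treated as a given number here, whereas in the text it is pinned down separately; the function name and all parameter values are illustrative assumptions.

```python
def price_ratios(beta, b, omega, lam):
    """Return q_k/p_k for goods k = 2, ..., n.

    beta[h]  : social marginal utility of income of household h
    b[h][i]  : Cobb-Douglas weight of household h on good i+1 (index 0 is good 1)
    omega[h] : parameter omega^h of household h
    lam      : the multiplier lambda, treated here as given
    """
    households = range(len(beta))
    ratios = []
    for k in range(1, len(b[0])):
        demand_term = sum(b[h][k] * omega[h] for h in households)
        weighted_term = sum(beta[h] * b[h][k] * omega[h] for h in households)
        ratios.append(lam * demand_term / weighted_term)
    return ratios


omega = [1.0, 4.0]
b = [[0.4, 0.5, 0.1],   # household 1 puts most weight on good 2
     [0.4, 0.1, 0.5]]   # household 2 puts most weight on good 3
unequal_beta = price_ratios([1.0 / w for w in omega], b, omega, lam=1.0)
equal_beta = price_ratios([0.5, 0.5], b, omega, lam=1.0)
```

In this illustration the good favored by the household with the lower social marginal utility of income bears the higher ratio, while equal weights give the same ratio for every good, that is, proportional taxation.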


To complete the determination of the
optimal taxes, we must find the relationship
between $\lambda$, $p_1$, and $q_1$. This is obtained
from the Walras identity. The value of net
consumer demand in producer prices is
equal to minus the profit in production.
(Alternatively, we could determine $\lambda$ so
that the government budget is balanced.)
That is


(71)   $-p_1 \sum_h (1 - b_1^h)\,\omega^h + \sum_{i=2}^{n} \sum_h p_i q_i^{-1} b_i^h q_1 \omega^h = \gamma,$

where $\gamma$ is the maximized profit of production
net of government needs ($= \sum_i p_i z_i$).
Substituting from (70) and rearranging,
we obtain


(72)   $\dfrac{q_1}{\lambda p_1} = \dfrac{\sum_h (1 - b_1^h)\,\omega^h + \gamma p_1^{-1}}{\sum_h \beta^h (1 - b_1^h)\,\omega^h}$


The number $\gamma p_1^{-1}$ is determined by the
technology and the government expenditure
decision, and therefore depends on
$p$ (unless $\gamma = 0$).

Equations (70) and (72) determine the
optimal tax rates. If the social marginal
utilities, $\beta^h$, are independent of taxation,
the optimal tax rates can be read off at
once. This is true if $W$ has the special form
$\sum_h v^h$; for in that case $\beta^h = 1/\omega^h$. It should
be noticed that, although each household's
social marginal utility of income is unaffected
by taxation, it is desirable to
have taxation in general.


In general, taxation does affect social
marginal utilities of income. The $\beta^h$ depend
on the tax rates, and equations (70) do
not, therefore, give explicit formulae for
the optimum taxes. In the case
$W = -\mu^{-1} \sum_h e^{-\mu v^h}$, $\mu > 0$, so that there is a
stronger bias toward equality than in the
additive case, it can be verified quite easily
that the optimum taxes have to satisfy


(73)   $\dfrac{q_k}{p_k} \sum_h b_k^h e^{-\mu v^h} = \lambda \sum_h b_k^h \omega^h \qquad (q_1 = 1)$

In this case, marginal utilities of income
are brought closer together.¹ It is not
immediately obvious from the equations
(70) that the $q$ are determined given the $p$.
However, it can be shown that, in the
present example, the first-order conditions
must have a unique solution.² In fact, the
relations (70) (along with (72)) would, if
followed by government, certainly lead to
maximum welfare if production were
perfectly competitive, since any state of
the economy satisfying these conditions
maximizes welfare, and the maximum is
unique for the welfare function considered.
Unfortunately this convenient property
is not general.


¹ If $\mu < 0$, utilities and marginal utilities are moved
further apart.

² It is easily verified that $v^h = \beta_h + \sum_i b_i^h \log(q_1/q_i)$,
where the $\beta_h$ are constants. Consequently

$V(q) = -\mu^{-1} \sum_h e^{-\mu \beta_h} \prod_i (q_1/q_i)^{-\mu b_i^h},$

which is a concave function of $(q_1/q_2, q_1/q_3, \ldots, q_1/q_n)$.
Also, aggregate demand is

$X_i(q) = \sum_h b_i^h \omega^h (q_1/q_i) \quad (i \ge 2), \qquad X_1(q) = -\sum_h (1 - b_1^h)\,\omega^h$

If the production set is convex, the set of $(q_1/q_2, \ldots, q_1/q_n)$
for which $(X_1, X_2, \ldots, X_n)$ is feasible is also convex.
Thus the optimum $q$ is obtained by maximizing a
concave function of $(q_1/q_2, \ldots, q_1/q_n)$ over a convex
set, and is therefore uniquely defined by the first-order
conditions.


From equation (70) we can identify two
cases in which optimal taxation is proportional.
If the social marginal utility of
income is the same for all individuals ($\beta^h = \beta$,
for all $h$), then equation (70) reduces to
$q_k p_k^{-1} = \lambda/\beta$.


... individuals. Thus the optimal tax formula ...


The second case leading to proportional
taxation occurs when demand vectors are
proportional for all individuals,
and thus $b_i^h = b_i$ for all $h$.


Optimal Tax Formulae


The description in Section VI of some
possible interpretations of the optimal tax
formula carries over to the many-consumer
case. Thus, as was true there, consumer
price elasticities but not producer
price elasticities enter the equations, and 
at the optimum the social marginal utility 
of a price change is proportional to the 
marginal change in tax revenue from rais- 
ing that tax, calculated at constant pro- 
ducer prices. Analysis of the change in 
demand can also be carried out, but is 
naturally more complicated. Assuming an
individualistic welfare function, the first-order
conditions can be written³


(74)   $\sum_h \beta^h x_k^h = \lambda \sum_h \Bigl[x_k^h + \sum_i t_i \dfrac{\partial x_i^h}{\partial q_k}\Bigr]$


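From the Slutsky equation, we know that


(75)   $\dfrac{\partial x_i^h}{\partial q_k} = s_{ik}^h - x_k^h \dfrac{\partial x_i^h}{\partial I} = s_{ki}^h - x_k^h \dfrac{\partial x_i^h}{\partial I} = \dfrac{\partial x_k^h}{\partial q_i} + x_i^h \dfrac{\partial x_k^h}{\partial I} - x_k^h \dfrac{\partial x_i^h}{\partial I}$
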
Substituting from (75) in (74) we can 
write the optimal tax formula as equation 


(76). Rearranging terms we can write 
equation (76) as (77). 


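(76)   $\sum_h \beta^h x_k^h = \lambda \sum_h \Bigl[x_k^h + \sum_i t_i \Bigl(\dfrac{\partial x_k^h}{\partial q_i} + x_i^h \dfrac{\partial x_k^h}{\partial I} - x_k^h \dfrac{\partial x_i^h}{\partial I}\Bigr)\Bigr]$


(77)   $\dfrac{\sum_h \sum_i t_i\, \partial x_k^h/\partial q_i}{\sum_h x_k^h} = \dfrac{\sum_h (\beta^h/\lambda)\, x_k^h}{\sum_h x_k^h} - 1 + \dfrac{\sum_h x_k^h \sum_i t_i\, \partial x_i^h/\partial I}{\sum_h x_k^h} - \dfrac{\sum_h (\partial x_k^h/\partial I) \sum_i t_i x_i^h}{\sum_h x_k^h}$


Assuming constant price derivatives of demand, the left-hand side
of equation (77) gives the proportional change in the demand for
good $k$ arising from the tax structure. The reduction in demand is
greater the more the demand for good $k$ is concentrated among:


(1) individuals with low social marginal
utility of income;

(2) individuals with small decreases in
taxes paid with a decrease in income;

(3) individuals for whom the product of the income derivative of
good $k$ and taxes paid is large.


³ We neglect the possibility of a free good when the
first-order condition would be an inequality.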


VIII. Other Taxes 


Thus far we have examined the com- 
bined use of public production and com- 
modity taxation as control variables. It is 
natural to reexamine the analysis when 
additional tax variables are included in 
those controlled by the government. In 
particular, in the next subsection we will 
briefly consider income taxation; but 


first, let us examine a general class of taxes 


epen 


We shall replace the budget
constraint $\sum_i q_i x_i = 0$ by the more general
constraint $\phi(x, q, \zeta) = 0$, where $\zeta$ represents
a shift parameter to reflect the choice
among different systems of additional
taxation (for example, the degree of progression
in the income tax). Let us note
that this formulation continues to assume


to permit an exten- 


sion of the analysis above is an depen 


We need to assume that the choice of tax


Income Taxation


i Nothing that we have said suggests that 
n P a T SOR? 


particular, this formulation impliesithatp tO The analysis has only considered 


Thus 


to fit this formulation gmeedsstombelléevied > 


O 


imilarly 


it is assumed that 


We know already thate 
e may 
therefore concentrate upon the case in 
which all production is controlled by the 
government, and the production constraint 
is that $X_1 = g(X_2, X_3, \ldots, X_n)$. We have to
choose $q_2, q_3, \ldots, q_n, \zeta$ to


(78)   maximize $V(q, \zeta)$ subject to $X_1(q, \zeta) = g\bigl(X_2(q, \zeta), \ldots, X_n(q, \zeta)\bigr)$

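As before we introduce a Lagrange multiplier
$\lambda$. Differentiation with respect to $q_k$
yields the familiar


(79)   $V_k = \lambda \sum_i p_i \dfrac{\partial X_i}{\partial q_k},$


where the producer price $p_i$ is $-\partial g/\partial X_i$
($i = 2, 3, \ldots, n$), and $p_1 = 1$. Differentiation
with respect to the new tax variable
provides the similar equation


(80)   $\dfrac{\partial V}{\partial \zeta} = \lambda \sum_i p_i \dfrac{\partial X_i}{\partial \zeta}$

We have an alternative form for (79),
namely,


(81)   $V_k = -\lambda \dfrac{\partial T}{\partial t_k}$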


In exactly the same way, we obtain from 
(80) a formula involving the effect of the 
new tax on total tax revenue, 


(82)   $\dfrac{\partial V}{\partial \zeta} = -\lambda \dfrac{\partial T}{\partial \zeta}$


the best use of commodity_taxation. It is 
natural to go on to ask oD 


j The formulation of income taxation 


aea 


4 
A z 6 D f t 


qoes no 


\/ J K 


. This eliminates 


macai dr problem, but Remp Sum 
l beri mae 
tools available-in a large economy. The- 


M. (This approach is taken by 


Mirrlees.) 


} 


O ernatl 


If only commodity taxation is possible,
the tax paid by a household that purchases
a vector $x^h$ is


$T^h = \sum_i t_i x_i^h$


To add income taxation to the tax structure,
we can select a subset of commodities,
$L$, e.g., labor services, and tax the value of
transactions on this subset, so that


(83)   $I^h = \sum_{i \text{ in } L} q_i x_i^h$


where $I^h$ is "taxable income." Then


(84)   $T^h = \sum_i t_i x_i^h + \tau(I^h, \zeta),$


where $\tau$ is a fixed continuously differentiable
function depending on a parameter $\zeta$,
and is the same for all consumers. With a
tax on services ($x_i$ negative) we would
expect $\tau$ to be decreasing in its tax base,
with a derivative between zero and minus
one. In terms of the notation employed
above, we can define the budget constraint
$\phi(x^h, q, \zeta)$ by


(85)   $\phi(x^h, q, \zeta) = \sum_i p_i x_i^h + T^h = \sum_i q_i x_i^h + \tau\Bigl(\sum_{i \text{ in } L} q_i x_i^h,\ \zeta\Bigr)$


Here we can regard $q$ and $\zeta$ as the policy
variables.
The first-order conditions for optimal
income taxation are just the conditions
(79) and (80), interpreted for this special
case.


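In the case
of an individualistic welfare function, we
can give more explicit formulae for the
welfare derivatives, $V_k$ and $V_\zeta$:


(86)   $V_k = -\sum_h \beta^h x_k^h \Bigl(1 + \delta_k \dfrac{\partial \tau^h}{\partial I}\Bigr), \qquad V_\zeta = -\sum_h \beta^h \dfrac{\partial \tau^h}{\partial \zeta},$


where $\delta_k = 1$ if $k$ is in $L$, 0 if $k$ is not in $L$;
and $\tau^h = \tau(I^h, \zeta)$.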

These equations are derived from the
first-order conditions for maximizing $u^h$
subject to $\phi = 0$, noticing that, for example,
the budget constraint implies that


87) Vee 


T Ox, ðt ato 
Combining (82) and (87), we obtain

(88)   $\sum_h \beta^h \dfrac{\partial \tau^h}{\partial \zeta} = \lambda \dfrac{\partial T}{\partial \zeta}$




Thus, 


IX. Public Consumption 


From the star weuhaveleonsideredmthe 


» T 


S 
therefore included in the model (and 


uae 1S ae and was assumed to 


keep as uncluttered as possible a naturally 
complicated problem. We can oWo 


sider a choice among vectors of public con- 
sumption which affect social welfare di- 
GE (We shal aaaea  - 

ment controls all production, thus ignoring 


Let us denote by e the vector of public 
consumption expenditures. (Items of pub- 
lic consumption which are difficult to 
measure can be described by the inputs 
into their production.) The presence of 
public consumption alters our problem in 
three ways. 


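First, public consumption must be produced.
Thus market clearance becomes
the requirement that $X + e$ be in $G$.
Second, public consumption affects private net demand,
which must now be written $X(q, e)$. Third,
public consumption affects social welfare directly (by
affecting individual utility in the case of
an individualistic welfare function).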

We can restate the basic maximization
problem as

(89)   Maximize $V(q, e)$ subject to $G(X(q, e) + e) \le 0$


The presence of $e$ in the problem will not
affect the equations obtained by differentiating
a Lagrangian expression with
respect to $q$. ... alternative
bundles of public consumption does ...
structure. Nor would we expect it to affect ...
We can therefore
replace the inequality in (89) with an
equality and differentiate the Lagrangian
expression with respect to $e_k$:


(90)   $\dfrac{\partial V}{\partial e_k} = \lambda \sum_i G_i \dfrac{\partial X_i}{\partial e_k} + \lambda G_k$

Since, with producer prices $p_i = G_i$ and $\sum_i q_i X_i = 0$,

(91)   $\sum_i p_i \dfrac{\partial X_i}{\partial e_k} = \dfrac{\partial}{\partial e_k}\Bigl(\sum_i q_i X_i - \sum_i t_i X_i\Bigr) = -\dfrac{\partial}{\partial e_k}\Bigl(\sum_i t_i X_i\Bigr),$

we can write (90) as

(92)   $\dfrac{\partial V}{\partial e_k} = -\lambda \dfrac{\partial}{\partial e_k}\Bigl(\sum_i t_i X_i\Bigr) + \lambda G_k$


Equations (92) show how the optimal
level of public consumption balances
(i) the marginal contribution of public
consumption to social welfare (measured by
$\partial V/\partial e_k$);
(ii) its effect on tax revenue (measured by
$\partial \sum_i t_i X_i/\partial e_k$); and
(iii) the direct cost of public consumption.


There are three differences between this
theory and that of public goods in the
presence of lump sum taxation (as developed,
for example, by Samuelson (1954)).
Because social marginal utilities of income
are not equated, the expression $\partial V/\partial e_k$
cannot be reduced to a sum of marginal
rates of substitution, but depends on the
weights given to the different beneficiaries
of public consumption:


(93)   $\dfrac{\partial V}{\partial e_k} = \sum_h \dfrac{\partial W}{\partial u^h} \dfrac{\partial u^h}{\partial e_k}$


Second, the cost associated with the raising
of government revenue implies that
the impact of public consumption on
revenue is a relevant part of the first-order
conditions. Third, for the same reason, the
cost of public consumption is measured in
terms of the cost to the government of
raising revenue to finance the expenditures
(in terms of the one-consumer equation,
$\lambda$ may not be equal to $\alpha$, the marginal
utility of income).

The first-order conditions for the provi- 
sion of public goods can be expressed in 
another way, showing the relationships 
between the marginal cost and ‘‘willingness 
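to pay." Write $\pi_k^h$ for the marginal rate
of substitution between public good $k$ and
income for the $h$th household. Then
$\partial u^h/\partial e_k = \alpha^h \pi_k^h$, where $\alpha^h$ is the $h$th household's
marginal utility of income. The
social marginal utility of the $h$th household's
income, $\beta^h$, is $(\partial W/\partial u^h)\alpha^h$. Consequently,
from (93)


(94)   $\dfrac{\partial V}{\partial e_k} = \sum_h \beta^h \pi_k^h$


Then, from (92)


(95)   $G_k = \sum_h \dfrac{\beta^h}{\lambda} \pi_k^h + \dfrac{\partial}{\partial e_k}\Bigl(\sum_i t_i X_i\Bigr)$

Thus the marginal cost of the public good is equated to the sum of
households' willingness to pay, each weighted by the social
"worth" of the household's income, and
adjusted for the effect of the level of provision on tax revenue.
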
In the discussion of public consumption
thus far it has been assumed that there
were no possible fees associated with the
provision of public goods. This would be
appropriate for national defense or preventive
medicine, but not for goods where licenses
can be required from users. The
optimal level of license fees will not, in general,
be zero. Indeed we may be able to associate
with any good more complicated
pricing mechanisms than the single fixed
price considered above. In particular, there
are the familiar examples of two-part tariffs ...
Presumably the introduction of more
general pricing and taxing schemes gives
an opportunity for increasing social welfare,
just as the progressive income tax
gives such an opportunity. In practice, the
... of complicated
pricing schemes which can increase welfare ...
We would expect the analysis done
above to be basically unchanged by the
addition of these possibilities, although a
two-part tariff will cause aggregate demand
to have discontinuities. In practice
we would expect these discontinuities to
be small relative to aggregate demand, and
formally, they could be eliminated by the
device of a continuum of consumers.


⁴ Another case can be treated in a similar manner:
that of limited government production of a good, which
is also being produced privately, when government production
is given away rather than being sold. Since the
government production rule given above does not reduce
to the first-order condition in producer prices, we
would not find aggregate production efficiency for the
sum of these two sources of production.


X. The Optimal Taxation Theorem 


In the earlier discussion, we employed 
calculus techniques to obtain the first- 
order conditions for the optimal tax struc- 
ture. However, the valid use of Lagrange 
multipliers is subject to certain restric- 
tions, which in the present case have no 
very obvious economic significance. This 
section provides a rigorous analysis of 
conditions under which the tax formulae 
(34) are indeed necessary conditions for 
optimality, and in particular provides 
economically meaningful assumptions that 
ensure their validity. The reader should be 
warned that the discussion is highly 
technical. 

One might hope to provide a rigorous 
analysis by using the well-known Kuhn- 
Tucker theorem for differentiable (not 
necessarily concave) functions. This the- 
orem requires a certain ‘‘constraint qualifi- 
cation” to be satisfied. Let us apply it and 
see how far we get. We wish to 


Maximize $V(q)$
subject to $g(X(q)) \le 0$ and $q \ge 0$,


where $g$ is a (vector) production constraint
such that $g(X) \le 0$ if, and only if, $X$ is in
$G$. Given that $V$, $X$, and $g$ are differentiable,
and that the Kuhn-Tucker constraint
qualification is satisfied, we have the first-order
conditions


(96)   $V'(q^*) = \dfrac{\partial V}{\partial q} \le p \cdot \dfrac{\partial X}{\partial q} = p \cdot X'(q^*),$


where $p = \lambda \cdot g'(X(q^*))$ for a vector of
Lagrange multipliers $\lambda$, and is therefore
a support or tangent hyperplane to $G$
at $X(q^*)$. Since $V$ and $X$ are homogeneous
of degree zero, $[V'(q^*) - p \cdot X'(q^*)] \cdot q^* = 0$:
consequently $\partial V/\partial q_i = p \cdot (\partial X/\partial q_i)$ for $i$
such that $q_i^* > 0$.

To express the first-order conditions in 
this form, we naturally expect to assume 
that V and X are continuously differentia- 
ble: to that extent, the differentiability 
assumptions are innocuous. The assump- 
tion that the production set can be de- 
scribed by a finite number of contin- 
uously differentiable inequality constraints 
that satisfy the constraint qualification is 
less satisfactory. The constraint qualifica- 
tion is an assumption about the functions 
g: one can violate it by changing the func- 
tions g without changing the actual con- 
straint set, G. Some such assumption is 
required to avoid not unreasonable 
counter-examples, as we shall see below. 
But it is not at all obvious how one would 
check whether a particular example that 
failed to satisfy the constraint qualifica- 
tion could be put right by describing G by 
a better behaved set of inequalities. We 
should like to use a constraint qualification 
that depends on the properties of the set 
G (and X) rather than the particular 
functions g; and we should like the as- 
sumption to be more amenable to eco- 
nomic interpretation. The theorem we 
prove below contains such an assumption, 
for the case where G is convex and has an 
interior. 

Before stating the theorem let us con- 
sider an example in which the first-order 
conditions are not satisfied at the opti- 
mum. 


Example g. Consider the one-consumer
economy. In the case shown in Figure 10,
the offer curve is tangent to the production
frontier at the optimum production point.
As $q$ varies, the vector $X(q)$ traces out the
offer curve. Thus, holding $q_2$ constant, the
vector $\partial X(q)/\partial q_1$ is tangent to the offer
curve at $X(q^*)$. Therefore if $p$ is the vector
of producer prices, which is tangent to the
production frontier at $X(q^*)$, $p \cdot \partial X(q^*)/\partial q_1 = 0$.
The same is true for the derivatives
with respect to $q_2$. But there is no reason
why $V'(q^*)$ should be zero: therefore the
above first-order conditions may not be
satisfied at the optimum.


FIGURE 10 (axes: good 1, good 2; origin O)


We shall make an assumption ruling out 
tangency between the frontier of the pro- 
duction set and the offer curve: 


(97)   For any $p$, $q$ ($q \ge 0$, $p \ne 0$) such that
$X(q)$ is in $G$ and $p \cdot X(q) \ge p \cdot x$ for all $x$
in $G$, $p \cdot X'(q) \not\ge 0$.


The qualification takes this particular form
because we also have the constraint $q \ge 0$.
Let us note that for $q > 0$ the condition
$p \cdot X'(q) \not\ge 0$ is equivalent to $p \cdot X'(q) \ne 0$,
because $X$ is homogeneous of degree zero.
The qualification asserts that for any
possible competitive equilibrium (under commodity
taxation) there is a consumer price
change which will decrease the value of
equilibrium demand, measured in producer
prices. By the aggregate consumer budget
constraint, $q \cdot X = (p + t) \cdot X = 0$. Therefore
the assumption says that at any possible
equilibrium point on the production frontier,
it is possible to increase tax revenue.
Thus the first-order conditions may not
be applicable if the optimal point represents
a local tax revenue maximum. Returning
to example g, we see that $p \cdot X' = 0$
at the optimum, or equivalently $\partial (t \cdot X)/\partial t = 0$,
although the derivatives of $V$ are
not necessarily zero there.

We now state and prove the theorem.⁵


THEOREM 5: Assume an optimum, (X*, 
_ q*) exists; that V(q) and X(q) are continu- 


PROOF: 

Let $P = \{p \mid p \cdot X^* \ge p \cdot x, \text{ all } x \text{ in } G\}$.
P is the cone of normals to G at X*, in- 
cluding the zero vector. It is a nonempty, 
closed, convex cone. 


⁵ It should be noticed that when the constrained
optimum is (locally) an unconstrained maximum, the
producer prices satisfying the theorem are zero. This
happens if optimal production is in the interior of the
production set and may happen if it is on the frontier.
The theorem can be weakened in a complicated manner
by replacing the nontangency qualification by two conditions.
One is an analog of the Kuhn-Tucker Constraint
Qualification providing for the existence of an arc in the
attainable set. The other use of nontangency occurs
when $V'$ is in $\bar{B}$ but not in $B$. If it is assumed that when
there is tangency, the cone of normals is polyhedral, $B$
will be closed. The Kuhn-Tucker theorem is then a
special case of the weakened version of theorem 5 when
$G$ is the nonnegative orthant. The Kuhn-Tucker
theorem is very much easier to prove, however.


We write $V'$ for $V'(q^*)$ and $X'$ for $X'(q^*)$.
Consider the set

$$B = \{v \mid v \leq p \cdot X', \text{ some } p \text{ in } P\}$$

We have to show that $V'$ is in $B$. We do this
by showing first, that if $V'$ is in $\bar B$, the
closure of $B$, in fact $V'$ is in $B$; and then
that $V'$ must be in $\bar B$.

If $V'$ is in $\bar B$, there exist sequences $\{v_n\}$
and $\{p_n\}$, $p_n$ in $P$, such that

$$v_n \leq p_n \cdot X', \qquad v_n \to V' \quad (n \to \infty) \tag{98}$$

Either $\{p_n\}$ is bounded or it is not. If not,
we can find a subsequence on which

$$\|p_n\| \to \infty, \qquad p_n/\|p_n\| \to \bar p$$

Then, dividing (98) by $\|p_n\|$ and letting
$n \to \infty$ on the subsequence, we obtain
$\bar p \cdot X' \geq 0$ while $\bar p$, $\neq 0$, is in $P$. This
possibility is excluded by assumption (97).
Therefore $\{p_n\}$ is bounded, and has a limit point
$p$, in $P$. Equation (98) implies that
$V' \leq p \cdot X'$. The conclusion of the theorem
is thus established on the assumption that
$V'$ is in $\bar B$.
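
The normalization step in the unbounded case may be easier to follow written out. The sketch below assumes the reading of (98) as $v_n \leq p_n \cdot X'$ with $v_n \to V'$, and writes $\bar p$ for the limit of the normalized vectors; both are reconstructions rather than quotations.

```latex
% Assumed reading of (98): v_n <= p_n . X',  v_n -> V'  (V' finite).
% Divide by ||p_n|| on a subsequence with ||p_n|| -> infinity:
\[
  \frac{v_n}{\|p_n\|} \;\leq\; \frac{p_n}{\|p_n\|} \cdot X' .
\]
% Left side: v_n converges, so v_n / ||p_n|| -> 0.
% Right side: the unit vectors p_n / ||p_n|| have a convergent
% subsequence with limit \bar p, ||\bar p|| = 1; since P is a closed
% cone, \bar p is in P.
\[
  0 \;\leq\; \bar p \cdot X' , \qquad \bar p \in P, \quad \bar p \neq 0 ,
\]
% which is exactly the case ruled out by the nontangency qualification (97).
```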

Suppose, on the contrary, that $V'$ is not
in $\bar B$. We shall derive a contradiction by a
sequence of lemmas.


LEMMA 5.1:
$\bar B$ is pointed. That is, $v$ and $-v$ both
belong to $\bar B$ only if $v = 0$.


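PROOF:
If $v$, $-v$ are in $\bar B$, we have sequences, with
$p_n^1$, $p_n^2$ in $P$, such that

$$v_n^1 \leq p_n^1 \cdot X', \qquad v_n^2 \leq p_n^2 \cdot X' \tag{99}$$

$$v_n^1 \to v, \qquad v_n^2 \to -v \tag{100}$$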


If $v \neq 0$, it cannot be the case that $p_n^1$ and
$p_n^2$ both tend to zero. Suppose, for example,
$p_n^1$ does not, and take a subsequence on
which

$$\|p_n^1\| \to m \leq \infty, \qquad p_n^1/\|p_n^1\| \to p^1 \neq 0$$

If $p_n^1 + p_n^2 \to 0$, $p_n^2/\|p_n^1\| \to -p^1$, and
therefore $-p^1$ is in $P$. This is impossible, since,
$G$ having a nonempty interior, $P$ is pointed.
(If $p$, $-p$ are in $P$, $p \cdot x$ is constant for $x$
in $G$, but a hyperplane has no interior.)


We can therefore take a subsequence on
which

$$\|p_n^1 + p_n^2\| \to \pi, \quad 0 < \pi \leq \infty, \qquad \frac{p_n^1 + p_n^2}{\|p_n^1 + p_n^2\|} \to p \neq 0$$

From (99) (adding and dividing by
$\|p_n^1 + p_n^2\|$) and (100), we now have

$$p \cdot X' \geq \lim \frac{v_n^1 + v_n^2}{\|p_n^1 + p_n^2\|} = 0 \tag{101}$$

This contradicts (97), since $p$ is in $P$ and
$p \neq 0$, and thereby establishes the lemma.


LEMMA 5.2: If $C$ is a pointed, closed,
convex cone, there exists a vector $p$ such that
for all non-zero $z$ in $C$, $p \cdot z < 0$.


PROOF: 

By the duality theorem for convex cones
$C^{++} = C$, where $C^+$ is the dual cone,
$\{p \mid p \cdot z \leq 0, \text{ all } z \text{ in } C\}$. Clearly, if $C^+$ is
pointed, $C$ has a nonempty interior: for if
interior $C$ is empty, $p \cdot z = 0$ for some
nonzero $p$ and all $z$ in $C$, and then $p$ and $-p$
both belong to $C^+$. Under the assumptions
of the theorem, $C$ is closed and pointed.
Therefore $C^{++}$ is pointed, and $C^+$ has an
interior point $p$. This $p$ satisfies

$$p \cdot z < 0 \qquad (\text{all nonzero } z \text{ in } C)$$

Otherwise, if $p \cdot z = 0$, we can easily find
a sequence $\{p_n\}$ on which $p_n \to p$ and
$p_n \cdot z > 0$, so that $p_n$ is not in $C^+$.
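
One concrete choice of the sequence in the last step is sketched below. The particular form $p_n = p + z/n$ is an illustration supplied here, not taken from the text; any sequence approaching $p$ from the direction of $z$ would serve.

```latex
% Suppose p is interior to C^+ and p . z = 0 for some nonzero z in C.
% Illustrative choice (an assumption of this sketch):
\[
  p_n = p + \tfrac{1}{n}\, z , \qquad p_n \to p \quad (n \to \infty) .
\]
% Then
\[
  p_n \cdot z = p \cdot z + \tfrac{1}{n}\, \|z\|^2 = \tfrac{1}{n}\, \|z\|^2 > 0 ,
\]
% so p_n violates p_n . z <= 0 and is not in C^+ for any n.
% Every neighborhood of p therefore contains points outside C^+,
% contradicting the assumption that p is an interior point of C^+.
```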


LEMMA 5.3: If $V'$ is not in $\bar B$, there exists
$r$ such that

$$V' \cdot r > 0 \tag{102}$$

$$v \cdot r < 0 \qquad (v \in B,\ v \neq 0) \tag{103}$$
PROOF: 


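The closed convex cone $\bar B + \{\lambda V' \mid \lambda \leq 0\}$
is pointed. Thus there exists an $r$ such that

$$v \cdot r + \lambda V' \cdot r < 0 \qquad (v \in \bar B,\ \lambda \leq 0,\ v, \lambda \text{ not both zero})$$

Putting $v = 0$ and $\lambda = -1$ we obtain (102);
putting $\lambda = 0$ we obtain (103).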


LEMMA 5.4: Let $r$ be a vector satisfying
(102) and (103). For some $\delta > 0$,

$$X(q^* + \theta r) \in G \qquad (0 \leq \theta \leq \delta) \tag{104}$$


PROOF: 
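Assume not. Then for some sequence
$\{\theta_n\}$, $\theta_n > 0$, $\theta_n \to 0$,

$$X(q^* + \theta_n r) \notin G$$

Since $G$ is convex, this implies that

$$X(q^*) + \frac{\lambda}{\theta_n}\,[X(q^* + \theta_n r) - X(q^*)] \notin G$$

for $\lambda \geq \theta_n$. Letting $n \to \infty$, we deduce, for
any $\lambda > 0$, that

$$X(q^*) + \lambda X' \cdot r = \lim_{n \to \infty} \left\{ X(q^*) + \frac{\lambda}{\theta_n}\,[X(q^* + \theta_n r) - X(q^*)] \right\}$$

is not in the interior of $G$. It follows that
the half-line $\{X(q^*) + \lambda X' \cdot r \mid \lambda > 0\}$ can be
separated from the interior of $G$ by a
hyperplane with normal $p \neq 0$:

$$p \cdot X(q^*) + \lambda\, p \cdot X' \cdot r \geq p \cdot x \qquad (\lambda > 0,\ x \in \operatorname{Int} G)$$

Letting $\lambda \to 0$ we have $p \in P$. Letting
$x \to X^*$ we have

$$p \cdot X' \cdot r \geq 0,$$

which contradicts (103) since $p \cdot X'$ is in
$B$. The lemma is proved.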
Since $q^*$ is optimal, (104) implies that

$$V(q^* + \theta r) \leq V(q^*) \qquad (0 \leq \theta \leq \delta)$$

Therefore,

$$V' \cdot r = \lim_{\theta \to 0} \frac{1}{\theta}\,[V(q^* + \theta r) - V(q^*)] \leq 0$$

This, however, contradicts (102). The
hypothesis of Lemma 5.3, that $V' \notin \bar B$, is
therefore false. The proof of the theorem is
thus complete.

In reaching our results that the first-order
conditions for optimum taxes (96)
hold in general, we have assumed that the
production set, G, is convex. But [...]
This is not a question we are
primarily concerned with in this paper.
However, some extensions [...].
As an example, assume the frontier
of G is differentiable at X*, so that p
can be uniquely defined as the normal at
X* and that G is not thin in the neighborhood
of X*, i.e., there exists a ball with
center on the normal through X*, contained
in G and containing X*. Applying
the theorem to this ball we get the validity
of the first-order conditions (96) using the
producer prices defined by the normal.

As in general welfare economics, two
uniqueness problems may arise when
considering the application of the first-order
conditions to achieve an optimum. In the
first place, there may be more than one
pair of price vectors, (p, q), that satisfy
the first-order conditions and allow
markets to be cleared. This is similar to
the problem that arises when we attempt
to define optimum production and
distribution by first-order conditions in the
presence of a non-convex production set.
It is noteworthy that, if lump sum transfers
are excluded as a feasible policy, this
problem may arise even when the production
set is convex. There is no reason why
the demand functions should have any of
the nice convexity properties which ensure
[...]mization. Only in particular cases,
such as that discussed in footnote 2 above
(where rigorous argument is possible without
appeal to theorem 5), will the first-order
conditions lead to a unique solution.

[...] Similarly, in the
present case, [...] though [...]

⁶ For a discussion of multiple equilibria in a related
problem, see E. Foster and H. Sonnenschein.


XI. Concluding Remarks


Welfare economics has usually been 
concerned with characterizing the best of 
attainable worlds, accepting only the 
basic technological constraints. As
economists have been aware, [...]
great insight into these problems has
certainly been acquired. We have not
attempted to come directly to grips with the
problem of incorporating these
complications into economic theory. Instead, we
have explored the implications of viewing
these constraints as limits on the set of 
policy tools that can be applied. There 
are many sets of policy tools which might 
be examined in this way. Specifically, we 
have assumed that the policy tools available
to the government include commodity
taxation (and subsidization) to any extent.
For these tools we have derived the rules
for optimal tax policy and have shown the
desirability of aggregate production
efficiency in the presence of optimal taxation.
We have also considered expansion of
the set of policy tools in such a way that
we continue to have the condition that
production decisions do not change the
class of possible budget constraints. For
example, this condition is still preserved
when one includes poll taxes, progressive 
income taxation, regional differences in 
taxation, taxation on transactions between 
consumers, and most kinds of rationing. 
This type of expansion of the set of policy 
tools does not alter the desirability of 
production efficiency, nor does it alter the 
conditions for the optimal commodity tax 
structure, although in general the tax rates 
themselves will change. We have,
unfortunately, ignored the cost of administering
taxes. Presumably optimization by
means of sets of policy tools that do not,
because of the cost of administration, include
the full scope of commodity taxation, will
not lead to the same conclusions.

Let us briefly consider the type of policy 
implications that are raised by our
analysis. In the context of a planned economy,
our analysis implies the desirability of
using a single price vector in all production 
decisions, although these prices will, in 
general, differ from the prices at which 
commodities are sold to consumers. 

As an application of this analysis to a 
mixed economy, let us briefly examine the 


discussion of a proper criterion for public 
investment decisions. As has been widely 
noted, there are considerable differences in 
western economies between the inter- 
temporal marginal rates of transformation 
and substitution. This has been the basis 
of analyses leading to investment criteria 
which would imply aggregate production 
inefficiency because they employ an inter- 
est rate for determining the margins of 
public production which differs from the 
private marginal rate of transformation. 
One argument used against these criteria 
is that the government, recognizing the 
divergence between rates of transforma- 
tion and substitution, should use its power 
to achieve the full Pareto optimum, bring- 
ing these rates into equality. When this is
done, the single interest rate then existing 
will be the appropriate rate to use in 
public investment decisions. We begin by 
presuming that the government does not 
have the power to achieve any Pareto
optimum that it chooses. Then from the
maximization of a social welfare function,
we argued that the government will, in
general, prefer one of the non-Pareto optima
to the Pareto optima, if any, that can be 
achieved. At the constrained optimum, 
which is the social welfare function maxi- 
mizing position of the economy for the 
available policy tools, we saw that the 
economy will still be characterized by a 
divergence between marginal rates of 
substitution and transformation, not just 
intertemporally, but also elsewhere, e.g., 
in the choice between leisure and goods.
However, we concluded that in this
situation we desired aggregate production
efficiency. This implies the use of interest
rates for public investment decisions which 
equate public and private marginal rates 
of transformation. 

We have obtained the first-order
conditions for public production, but we have
not considered the correct method of
evaluating indivisible investments. This
is one problem that deserves examination.
In examining the optimal tax structure, we
have briefly considered the tax rates
implied by particular utility functions. This
analysis should be extended to more
[...] consumers. Further, [...]

[...] We hope, nevertheless,
that the methods and results of this paper
have shown that economic analysis need
not depend on the simplifying, but
unrealistic, assumption that the perfect
capital levy has taken place.